30 research outputs found

Further advantages of data augmentation on convolutional neural networks

Author: DC Ciresan, J Lemley, N Srivastava, VN Vapnik, X Lu, Y LeCun
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 26/06/2019

Data augmentation is a popular technique widely used to enhance the training of convolutional neural networks. Although many of its benefits are well known to deep learning researchers and practitioners, its implicit regularization effects, as compared to popular explicit regularization techniques such as weight decay and dropout, remain largely unstudied. In fact, convolutional neural networks for image object classification are typically trained with both data augmentation and explicit regularization, on the assumption that the benefits of all techniques are complementary. In this paper, we systematically analyze these techniques through ablation studies of different network architectures trained with different amounts of training data. Our results unveil a largely ignored advantage of data augmentation: networks trained with just data augmentation adapt more easily to different architectures and amounts of training data, as opposed to weight decay and dropout, which require specific fine-tuning of their hyperparameters. Comment: Preprint of the manuscript accepted for presentation at the International Conference on Artificial Neural Networks (ICANN) 2018. Best Paper Awar
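As a rough illustration of what label-preserving data augmentation means in practice, the sketch below applies a random horizontal flip and a small random translation to a toy image stored as a list of rows. The function names (`horizontal_flip`, `translate`, `augment`), the `max_shift` parameter and the zero padding are assumptions made for this example; the abstract does not specify which transformations the paper uses.

```python
import random


def horizontal_flip(image):
    """Mirror a 2-D image (a list of rows) left to right."""
    return [row[::-1] for row in image]


def translate(image, dx, dy, fill=0):
    """Shift the image by (dx, dy) pixels, padding exposed pixels with `fill`."""
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            new_y, new_x = y + dy, x + dx
            if 0 <= new_y < height and 0 <= new_x < width:
                shifted[new_y][new_x] = image[y][x]
    return shifted


def augment(image, rng, max_shift=1):
    """Return a randomly flipped and shifted copy of `image`.

    The class label is left untouched, which is what makes the
    transformation usable as extra training data.
    """
    if rng.random() < 0.5:
        out = horizontal_flip(image)
    else:
        out = [row[:] for row in image]  # plain copy, no flip
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    return translate(out, dx, dy)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    toy_image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    for _ in range(3):
        print(augment(toy_image, rng))
```

In a real training loop each mini-batch would be passed through a function like `augment` before the forward pass, so the network rarely sees exactly the same input twice.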

arXiv.org e-Print Archive

Crossref

Hardening against adversarial examples with the smooth gradient method

Author: Alan Mosca, DC Ciresan, DE Rumelhart, George D. Magoulas, R Miikkulainen, RH Hahnloser, Y Ganin, Y LeCun
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/01/2018

Commonly used methods in deep learning do not utilise transformations of the residual gradient available at the inputs to update the representation in the dataset. It has been shown that this residual gradient, which can be interpreted as the first-order gradient of the input sensitivity at a particular point, may be used to improve generalisation in feed-forward neural networks, including fully connected and convolutional layers. We explore how these input gradients are related to the input perturbations used to generate adversarial examples, and how networks trained with this technique are more robust to attacks generated with the fast gradient sign method.
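The fast gradient sign method mentioned above perturbs an input by a small step in the direction of the sign of the loss gradient with respect to that input. The sketch below shows this attack on a toy logistic-regression model, where the input gradient has the closed form (p - y) * w. It illustrates only the attack, not the smooth gradient training method of the paper, and the model, the function names and the value of `epsilon` are assumptions chosen for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def input_gradient(weights, bias, x, y):
    """Gradient of the cross-entropy loss with respect to the input x.

    For logistic regression this is (p - y) * w, where p is the
    predicted probability and y is the true label (0 or 1).
    """
    p = predict(weights, bias, x)
    return [(p - y) * w for w in weights]


def sign(value):
    return (value > 0) - (value < 0)


def fgsm(weights, bias, x, y, epsilon):
    """Move each input coordinate by epsilon in the direction that increases the loss."""
    grad = input_gradient(weights, bias, x, y)
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


if __name__ == "__main__":
    weights, bias = [2.0, -1.0], 0.0  # toy model, assumed for illustration
    x, y = [1.0, 1.0], 1
    x_adv = fgsm(weights, bias, x, y, epsilon=0.5)
    print("clean input:      ", x, predict(weights, bias, x))
    print("adversarial input:", x_adv, predict(weights, bias, x_adv))
```

For a deep network the same recipe applies, except that the input gradient is obtained by back-propagation rather than from a closed-form expression.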

Crossref

Birkbeck Institutional Research Online

A multi-biometric iris recognition system based on a deep learning approach

Author: A Das, A Kumar, A Ross, AA Kerim, Alaa S. Al-Waisy, AR Syafeeza, B Abibullaev, C Li, D Menotti, DC Ciresan, F Jan, H AlMahafzah, H Khiyari El, H Mehrotra, H Proenc, J Duchi, JG Daugman, K Hajari, K Roy, L Deng, L Masek, M Elgamal, M Mahlouji, M Vatsa, MK Pawar, MM Monwar, N Srivastava, PR Nalla, R Duda, R Gad, R Hentati, R Salakhutdinov, R Zeng, Rami Qahwaji, RH Abiyev, RM Costa Da, RP Wildes, S Ding, S Lim, S Umer, SA Sahmoud, Shumoos Al-Fahdawi, SS Dhage, Stanley Ipson, T Tan, Tarek A. M. Nagem, WW Boles, X Ren, Y Bengio
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 24/10/2017

Multimodal biometric systems have been widely applied in many real-world applications due to their ability to deal with a number of significant limitations of unimodal biometric systems, including sensitivity to noise, population coverage, intra-class variability, non-universality, and vulnerability to spoofing. In this paper, an efficient and real-time multimodal biometric system is proposed based on building deep learning representations for images of both the right and left irises of a person, and fusing the results obtained using a ranking-level fusion method. The trained deep learning system proposed is called IrisConvNet, whose architecture is based on a combination of a Convolutional Neural Network (CNN) and a Softmax classifier to extract discriminative features from the input image without any domain knowledge, where the input image represents the localized iris region, and then classify it into one of N classes. In this work, a discriminative CNN training scheme based on a combination of the back-propagation algorithm and the mini-batch AdaGrad optimization method is proposed for weight updating and learning rate adaptation, respectively. In addition, other training strategies (e.g., dropout method, data augmentation) are also proposed in order to evaluate different CNN architectures. The performance of the proposed system is tested on three public datasets collected under different conditions: the SDUMLA-HMT, CASIA-Iris-V3 Interval and IITD iris databases. The results obtained from the proposed system outperform other state-of-the-art approaches (e.g., Wavelet transform, Scattering transform, Local Binary Pattern and PCA) by achieving a Rank-1 identification rate of 100% on all the employed databases and a recognition time of less than one second per person.
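To make the mini-batch AdaGrad step concrete, the sketch below implements the standard AdaGrad rule: each weight keeps a running sum of its squared gradients, and its effective learning rate is divided by the square root of that sum. It is applied here to a toy linear model with a squared-error loss rather than to IrisConvNet. The function names, the learning rate and the toy data are assumptions made for this example and are not taken from the paper.

```python
import math


def adagrad_step(weights, grads, accum, learning_rate=0.1, eps=1e-8):
    """One AdaGrad update.

    Returns the new weights and the updated per-weight sums of squared
    gradients. A larger accumulated sum gives a smaller effective step.
    """
    new_accum = [a + g * g for a, g in zip(accum, grads)]
    new_weights = [
        w - learning_rate * g / (math.sqrt(a) + eps)
        for w, g, a in zip(weights, grads, new_accum)
    ]
    return new_weights, new_accum


def minibatch_gradient(weights, batch):
    """Mean gradient of 0.5 * (w . x - y)^2 over a mini-batch of (x, y) pairs."""
    grads = [0.0] * len(weights)
    for x, y in batch:
        error = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            grads[i] += error * xi / len(batch)
    return grads


if __name__ == "__main__":
    # Toy data generated by y = 3 * x (assumed for illustration).
    batch = [([1.0], 3.0), ([2.0], 6.0)]
    weights, accum = [0.0], [0.0]
    for _ in range(200):
        grads = minibatch_gradient(weights, batch)
        weights, accum = adagrad_step(weights, grads, accum, learning_rate=1.0)
    print(weights)  # should end up close to [3.0]
```

In a CNN the gradients would come from back-propagation over each mini-batch, with the same per-weight update applied to every parameter.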

Crossref

Bradford Scholars